Partitioning Uncertain Workflows

Authors

  • Bernardo A. Huberman
  • Freddy Chong Tat Chua
Abstract

It is common practice to partition complex workflows into separate channels in order to speed up their completion times. When this is done within a distributed environment, unavoidable fluctuations make individual realizations depart from the expected average gains. We present a method for breaking any complex workflow into several workloads in such a way that, once their outputs are joined, their full completion takes less time and exhibits smaller variance than when running in only one channel. We demonstrate the effectiveness of this method in two different scenarios: the optimization of a convex function and the transmission of a large computer file over the Internet.
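
The claim about shorter and less variable completion times can be illustrated with a small Monte Carlo sketch. This is not the authors' method, which the abstract does not spell out. It assumes two independent channels, an even split of the work, and a gamma-distributed time per unit of work; the function name `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate(total_work=1.0, trials=20000, seed=0):
    """Compare one channel against an even two-channel split (illustrative only)."""
    rng = random.Random(seed)

    def unit_time():
        # Assumed noise model: gamma-distributed time per unit of work, mean 1.0.
        return rng.gammavariate(4.0, 0.25)

    single, split = [], []
    for _ in range(trials):
        # One channel processes the whole workload.
        single.append(total_work * unit_time())
        # Two independent channels each take half of the work;
        # joining the outputs means waiting for the slower channel.
        half = total_work / 2
        split.append(max(half * unit_time(), half * unit_time()))

    return {
        "single": (statistics.mean(single), statistics.pvariance(single)),
        "split": (statistics.mean(split), statistics.pvariance(split)),
    }


if __name__ == "__main__":
    for name, (mean, var) in simulate().items():
        print(f"{name}: mean={mean:.3f} variance={var:.3f}")
```

Under these assumptions the split run should show both a lower mean and a lower variance than the single-channel run. An uneven split, or channels with different statistics, would change the numbers and possibly the conclusion.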

Similar resources

Collaborative Data-centric Workflows: Towards Knowledge centric workflows and Integrating Uncertain Data

The acquisition of data, in particular for scientific data, is more and more organized in complex processes that are captured by workflows. These workflows are often driven by ontologies. For example the collaborative application Spipoll [3] proposes to collect information about pollination in France. The users take pictures of insects on flowers, download them on the application and then ident...

Partitioning and Scheduling Workflows across Multiple Sites with Storage Constraints

This paper aims to address the problem of scheduling large workflows onto multiple execution sites with storage constraints. Three heuristics are proposed to first partition the workflow into sub-workflows. Three estimators and two schedulers are then used to schedule subworkflows to the execution sites. Performance with three real-world workflows shows that this approach is able to satisfy sto...

A Bayesian Approach to the Partitioning of Workflows

When partitioning workflows in realistic scenarios, the knowledge of the processing units is often vague or unknown. A naive approach to addressing this issue is to perform many controlled experiments for different workloads, each consisting of multiple trials, in order to estimate the mean and variance of the specific workload. Since this controlled experimental approach can be quite ...

Density-Based Clustering Based on Probability Distribution for Uncertain Data

Today a great deal of uncertain digital data is produced, and handling it is very difficult. Commonly, the distance between uncertain object descriptions is expressed by one numerical distance value. Clustering on uncertain data is one of the essential and challenging tasks in mining uncertain data. The previous methods extend partitioning clustering methods like k-mean...

Remodelling Scientific Workflows for Cloud

In recent years, cloud computing has raised significant interest in the scientific community. Running scientific experiments in the cloud has its advantages like elasticity, scalability and software maintenance. However, the communication latencies are observed to be the major hindrance for migrating scientific computing applications to the cloud. The problem escalates further when we consider ...

Journal:
  • CoRR

Volume: abs/1507.00391

Publication date: 2015